ABSTRACT

As part of an effort to develop strategies for
measuring institutional effectiveness, a study was conducted at West
Virginia University at Parkersburg (WVU-P), a two-year institution
with an enrollment of nearly 4,000 students, to determine the most
appropriate instrument for measuring gains in general education
between incoming freshmen (i.e., with 15 or fewer credit hours) and
near completers (i.e., with more than 45 but fewer than 75 credit
hours). WVU-P reviewed five assessment instruments available for
measuring general education attainment, electing to pilot test short
forms of the Academic Profile (AP) and the College BASE (CB). The AP
was administered to 69 students and the CB was administered to 48,
while the total sample population had a mean age of 24.6 and was
57.4% female. The mean score for the AP was 444, with a standard
deviation of 15.7, compared to a mean AP score of 443 nationwide. The
CB mean score for English was 268, with a standard deviation of 64,
and 196 for mathematics, with a standard deviation of 46. Test scores
indicate that entering WVU-P students are below the national average
in general education, but after completion of 45 credit hours,
students are generally above the national average. Moreover, the AP
was found to be a more consistent and statistically meaningful
measure than the CB with respect to differences between incoming
freshmen and near-completers in general education. A Report of Student
Outcome Assessment Tests Analysis is appended. Contains eight
references. (KP)






Assessing General Education: Selection and Implementation
of an Instrument to Satisfy
Internal and External Constituencies



HongYu Chen
Director, Institutional Research
Phil O. McClung
Chair, Outcomes Assessment Committee
Associate Professor in Psychology
Eldon Miller



Campus President

WVU at Parkersburg
Parkersburg, WV 26101






Presented at the
Southeastern Association for Community College Research
Twenty-third Annual Conference
Savannah, Georgia

August 2, 1994



Table of Contents

Sample Selection
Pilot Testing
Scoring
Results
Data Analysis
Analysis on the Academic Profile
Analysis on the College BASE
Conclusion
References
Appendix A



List of Tables

1. Percentage of Students Who Reached Each Level for the College BASE
2. Descriptive Information of Grouping for Academic Profile
3. ANOVA Table for Academic Profile
4. Post Hoc Power Estimation
5. Descriptive Information of Grouping for College BASE
6. ANOVA Results for College BASE



List of Figures

1. WVU-P Student Proficiency Level on College BASE
2. WVU-P Student Performance on Academic Profile
3. WVU-P Student Performance on College BASE



Assessing General Education: Selection and Implementation
of an Instrument to Satisfy Internal and External Constituencies

West Virginia University at Parkersburg (WVU-P) is committed
to the support of efforts that study, develop, evaluate, and
implement strategies related to increasing institutional
effectiveness. More recently, the focus has been on developing
strategies to measure outcomes in the area of general education.
A number of professionally developed instruments are available
to measure gains in general education. The following were
considered for implementation at WVU-P: the College Outcome
Measures Program (COMP) and the Collegiate Assessment of Academic
Proficiency (CAAP), both from American College Testing (ACT);
the Academic Profile from Educational Testing Service (ETS); and
the College Basic Academic Subjects Examination (College BASE)
from the Riverside Publishing Company (RPC).

Past studies indicated that both COMP and the Academic
Profile are not appropriate for evaluating the impact of general
education outcomes (Pike, 1988). As for the instruments from
ACT, both COMP and CAAP are not adequately reliable and are not
sensitive to the institution's general education course work
(Pike, 1988). Welch (1989) stated that black students did not
perform as well as their white peers on CAAP. Doolittle
(1989) found no overall performance differences on CAAP between
males and females. Regardless of the drawbacks pointed out by
the researchers, many higher education institutions still adopted
ACT-COMP and/or CAAP as their assessment tools. As for the
instrument from RPC, Thorndike (1992) found that the College
BASE has good test-retest reliability but also shows growth in
subject areas. However, as pointed out by Thorndike, College
BASE adds little information about entering students beyond what
is available from other sources. Stiggins (1985) stated that
self-developed tests are far better in aiding teachers in their
classroom-based assessment than standardized tests. However,
because of the emphasis on public awareness and accountability,
WVU-P was to use a nationally recognized standardized instrument.

A study conducted at WVU-P in early Spring, 1992 (Appendix
A) indicated that the CAAP was the least preferred instrument
among "general education core-curriculum" faculty. For
financial reasons, WVU-P elected to use the Academic Profile and
the College BASE in this pilot study.

The pilot study was conducted during the next-to-last week
of April, 1992. The test instruments utilized were the short
form of the Academic Profile (AP) from ETS and the short form of
the College BASE from the RPC. The purpose of the pilot study
was to find the most appropriate instrument that would accurately
measure the gain between students who are incoming freshmen (zero
credit hours earned) and near completers (earned 45 credit hours
or more) in the general education curriculum.

Sample Selection

WVU-P is a non-residential, two-year institution with an
enrollment of nearly 4,000 students, consisting of 63% females
and 2% minorities. The majority of the students are employed
while they take classes and the mean age of the student body is 
26.6 years. Due to the nature of the commuting/working students, 
it is almost impossible to select a true random sample and expect 
the sampled students to take a test without monetary or tangible 
incentives, especially at the end of the semester. Therefore, 
WVU-P's Outcomes Assessment Committee (OAC) , with the consent of 
course instructors, selected four classes* for the pilot testing. 
Among the selected classes, there was a morning class, an evening 
class, and two afternoon classes. The total number of students 
tested was 117, with a gender distribution of 42.6% male, 57.4% 
female, and a mean age of 24.6 years.

Pilot Testing

The pilot tests were implemented by Phil McClung and HongYu 
Chen in the presence of the class instructor. The tests were 
administered using guidelines designed by the testing companies. 
Based on the observations, the majority of the students took the 
tests seriously and answered questions thoughtfully. However, 
because the results of these tests would not influence the 
students' grades in any way, it was felt that some students 
marked the answers without careful thinking, especially in the 
afternoon sessions.

Scoring

The tests were scored by the testing companies. It took
approximately four weeks to score the AP and six weeks to score
the College BASE (CB).

Results

The short form of the AP test includes English, math,
natural and social sciences, with an administration time of 40
minutes. ETS reports a single score for each student. The
mean score for AP (total of 69 students) was 444, with a standard
deviation of 15.7. Compared to the mean score (443) of two-year
colleges that had used AP nationwide, WVU-P's score is slightly
higher than the national norm.

The short form of CB includes only English and math, with a
60-minute completion time. The mean CB score for English (total
of 48 students) was 268, with a standard deviation of 64. The
mean score for mathematics was 196, with a standard deviation of
46. There were no national norms available for comparison.
However, the RPC reports both students' scores and their
proficiency levels (high, medium, or low). The majority of WVU-P
students scored within the medium and low levels, as shown in
Table 1 and Figure 1.

Data Analysis

All the analyses were done in the Office of Institutional 
Research at WVU-P. 

WVU-P, SACCR 23rd Annual Conference, August, 1994 





Table 1

Percentage of Students Who Reached Each Level for the College BASE

Subject      High     Medium    Low

English      13.3%    51.3%     35.3%
Math          2.1%    43.1%     54.7%

Analysis on the Academic Profile

In order to see whether AP showed a significant gain between
incoming freshmen and near completers, analysis of variance
(ANOVA) was used to test the significance of the mean
difference. Because few students had earned exactly zero or
45 hours, the grouping was based on:
a) incoming freshmen - students who had earned 15 or fewer
cumulative hours at the time of administration; b) near
completers - students who had earned more than 45 but fewer than
75 hours by April, 1992. There were 24 students in the incoming
freshmen group and 18 students in the near completers group.
Although the group sizes are unequal, the larger variance is
associated with the larger group, as shown in Table 2, so the
F statistic is robust (Stevens, 1990).

Compared to students in other comprehensive colleges and
universities, WVU-P freshmen were more academically disadvantaged
(ETS, 1991). However, among sophomores, the WVU-P students' mean
score was higher than the mean scores at those colleges and
universities (Figure 2). This indicates that
entering WVU-P students are below the national average in general
education, but after completion of 45 credit hours, these
students are above the national average.

Table 2

Descriptive Information of Grouping for Academic Profile

Group                  N     Mean     Std Dev    Min.    Max.

Incoming Freshmen      24    439.9    16.8       413     467
Near Completers        18    449.6    13.0       428     476

The ANOVA table presented in Table 3 shows no statistically
significant difference between the mean scores of incoming
freshmen and near completers at the .05 alpha level (although it is
very close to being significant). However, the near 10-point gain of
the near completers group may have practical meaning even though it
is not statistically significant. A post hoc estimation of power
based on the available sample, presented in Table 4, indicates that
if one were looking for a large gain, the power was sufficient for
detecting existing differences (power = 0.93). However, if one
were interested in detecting small differences, the power would
be insufficient (power = 0.10) based on the sample size selected.
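
The F ratio in Table 3 can be approximately reconstructed from the group sizes, means, and standard deviations in Table 2. The sketch below is an illustration only, not the procedure that was actually run; the function name one_way_f and the tuple layout are assumptions made for this example. Because Table 2 reports rounded values, the recomputed sums of squares differ slightly from those printed in Table 3. The last lines reproduce the harmonic mean of the two group sizes listed in Table 4.

```python
from statistics import harmonic_mean


def one_way_f(groups):
    """One-way ANOVA from summary statistics.

    groups is a list of (n, mean, sd) tuples (a hypothetical layout).
    Returns (ss_between, ss_within, df_between, df_within, f_value).
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    # Between-groups sum of squares: weighted squared deviations of group means.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-groups sum of squares: (n - 1) times each group variance.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_value


# Academic Profile summaries from Table 2: (n, mean, standard deviation).
ap_groups = [(24, 439.9, 16.8), (18, 449.6, 13.0)]
ss_between, ss_within, df_between, df_within, f_value = one_way_f(ap_groups)

# Harmonic mean of the two group sizes, as listed in Table 4.
n_harmonic = harmonic_mean([24, 18])

print(f"SS between = {ss_between:.1f}, SS within = {ss_within:.1f}")
print(f"F({df_between}, {df_within}) = {f_value:.2f}")
print(f"Harmonic mean of group sizes = {n_harmonic:.1f}")
```

With the Table 2 figures this gives an F ratio of roughly 4.1 on 1 and 40 degrees of freedom and a harmonic mean of about 20.6, in line with Tables 3 and 4.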

Analysis on the College BASE 

Three separate ANOVAs on English, math, and total score
were conducted to test the anticipated difference.
However, due to a lack of "qualified" subjects, the grouping for
these analyses was slightly different: a) incoming freshmen -
students who had earned 15 or fewer cumulative hours when the
test was administered; b) near completers - students who had
earned more than 45 hours by April, 1992. There were equal
numbers of students (18) in the incoming freshmen and the
near completers groups. Descriptive information for the CB
respondents is presented in Table 5 and Figure 3.

Compared to the AP scores, CB shows a larger gain. However,
the standard deviation for CB is much greater than that for AP.
This could be the result of differences in how seriously
students took the test as well as differences in the
general education backgrounds of our sampled students.
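
One way to compare gains measured on scales with very different spreads is to divide each raw gain by the pooled standard deviation of the two groups. This standardized gain (Cohen's d) is not part of the original analysis; it is added here as an illustrative assumption, using the summaries in Tables 2 and 5 and a hypothetical helper named standardized_gain.

```python
from math import sqrt


def standardized_gain(n1, mean1, sd1, n2, mean2, sd2):
    """Raw gain (group 2 minus group 1) divided by the pooled standard deviation."""
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean2 - mean1) / sqrt(pooled_variance)


# (n, mean, sd) for incoming freshmen followed by (n, mean, sd) for near completers.
summaries = {
    "AP": (24, 439.9, 16.8, 18, 449.6, 13.0),              # Table 2
    "CB English": (18, 247.8, 68.0, 18, 279.1, 52.7),      # Table 5
    "CB mathematics": (18, 185.2, 59.5, 18, 207.6, 50.8),  # Table 5
}

gains = {name: standardized_gain(*values) for name, values in summaries.items()}

for name, values in summaries.items():
    raw_gain = values[4] - values[1]
    print(f"{name}: raw gain {raw_gain:.1f}, standardized gain {gains[name]:.2f}")
```

On these figures the raw CB gains are larger, but once each gain is expressed in pooled standard deviation units the AP gain (about 0.63) exceeds the CB English (about 0.51) and mathematics (about 0.40) gains.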

The ANOVA results presented in Table 6 indicate that there
was no statistically significant difference between the mean scores
of incoming freshmen and near completers at the .05 alpha level for
all three scores (English, mathematics, and total).

Table 3

ANOVA Table for Academic Profile

Source    DF    Sum of Squares    Mean Squares    F Value    P

Model      1         955.6            955.6         4.08     .0503
Error     40        9380.3            234.5
Total     41       10335.9

Table 4

Post Hoc Power Estimation

Effect Size     Sample Size    Harmonic Mean    Power

Small (0.1)         42             20.6          .10
Medium (0.3)        42             20.6          .52
Large (0.5)         42             20.6          .93

Conclusion

The Academic Profile, in comparison with the College BASE,
was found to be a more consistent and statistically meaningful
measure of differences between the incoming freshmen and the
near-completers at WVU-P. This study adds credibility to a
previously conducted study in which a majority of the WVU-P faculty
rated the Academic Profile as the preferred instrument to measure
gains in general education studies.



* The selected classes were Dr. Robert McCloy's business 381 and 
410, and two of Mr. Phil McClung's psychology 111 classes. 


Table 5

Descriptive Information of Grouping for College BASE

Group                      Mean     Std Dev    Min.    Max.

English
  Incoming Freshmen        247.8     68.0      126     352
  Near Completers          279.1     52.7      185     405
Mathematics
  Incoming Freshmen        185.2     59.5      142     274
  Near Completers          207.6     50.8      117     304
Total (English + Math)*
  Incoming Freshmen        443.1    100.1      268     626
  Near Completers          486.7     82.2      350     629

* The minimum and maximum English and math scores may not belong
to the same person, so they need not add up to the minimum and
maximum totals.

Table 6

ANOVA Results for College BASE

Score          Degrees of Freedom    F Value    P Value

English             1, 34             2.37        .13
Mathematics         1, 34             1.47        .23
Total               1, 33             2.32        .14
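
The English and mathematics statistics in Table 6 can be approximately recovered as F ratios from the Table 5 summaries. With two groups of equal size n, the one-way ANOVA F ratio reduces to n(m1 - m2)^2 / (s1^2 + s2^2). The sketch below assumes 18 students per group, as stated in the text, and uses a hypothetical helper named f_equal_groups; it is a check on the arithmetic, not the original computation. The total score is left out because its error degrees of freedom (33) indicate one fewer subject, and Table 5 does not give the group sizes for that analysis.

```python
def f_equal_groups(n, mean_a, sd_a, mean_b, sd_b):
    """One-way ANOVA F ratio for two groups that each contain n subjects."""
    # Between-groups mean square (1 degree of freedom).
    ms_between = n * (mean_a - mean_b) ** 2 / 2
    # Within-groups mean square: average of the two group variances.
    ms_within = (sd_a ** 2 + sd_b ** 2) / 2
    return ms_between / ms_within


N_PER_GROUP = 18  # incoming freshmen and near completers, per the text

# Means and standard deviations from Table 5.
f_english = f_equal_groups(N_PER_GROUP, 247.8, 68.0, 279.1, 52.7)
f_math = f_equal_groups(N_PER_GROUP, 185.2, 59.5, 207.6, 50.8)

print(f"English:     F(1, 34) = {f_english:.2f}")
print(f"Mathematics: F(1, 34) = {f_math:.2f}")
```

The results, about 2.38 for English and 1.48 for mathematics, agree with the Table 6 entries of 2.37 and 1.47 to within rounding of the Table 5 inputs.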
Appendix A - REPORT OF STUDENT OUTCOME ASSESSMENT TESTS ANALYSIS

Drawing on the time and expertise of department chairpersons and
faculty members, we have completed an analysis of content quality/
suitability and difficulty for WVU-P students on three well-known test
sets (all three tests claim to be suitable for measuring student
learning outcomes in a two-year college program). The test sets
examined were:

Test set A - "Academic Profile" from Educational Testing Service,
Test set B - "College BASE" from the Riverside Publishing Company,
Test set C - "Collegiate Assessment of Academic Proficiency" (CAAP) from
American College Testing Program (ACT).

The four subjects analyzed
were: English, Mathematics, Natural Sciences, and Social Sciences.

The ratings from faculty analyzing the tests (26 in total: 10 from
English, 3 from Mathematics, 5 from Natural Sciences, and 8 from Social
Sciences Department) indicate a preference for Academic Profile (Test
set A). The overall faculty ratings on the three test sets are:



Test    Quality/Suitability            Test Difficulty               Faculty's
Sets    New Student   45 Hrs Plus      New Student   45 Hrs Plus     Preference

A          3.12          2.87             2.28          2.64          First
B          3.08          3.04             2.52          3.13          Second
C          3.58          3.34             1.91          2.57          Third



Note: The numbers reported above are weighted mean values based on the
following criteria:

a) Test Quality/Suitability

1 = Excellent 2 = Good 3 = Neutral 4 = Poor 5 = Very Poor.

b) Test Difficulty

1 = Too Difficult 2 = Somewhat Difficult 3 = About Right
4 = Somewhat Easy 5 = Too Easy.

The participating faculty indicated that test A (Academic Profile)
is suitable for both our incoming freshmen (zero credit hours earned)
and near completers (students near completion of the AA, AS, or AAS
program who have earned more than 45 credit hours). The faculty ratings
on Academic Profile shifted from "somewhat difficult" for our incoming
freshmen toward "about right" for our near completers.
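
The shift described above can be read off the difficulty scale by mapping each weighted mean to its nearest scale point. The snippet below is a small illustration; the helper nearest_label is hypothetical, and rounding a mean rating to the closest label is an assumption about how the means were interpreted.

```python
# Test Difficulty scale from the note above.
DIFFICULTY_LABELS = {
    1: "Too Difficult",
    2: "Somewhat Difficult",
    3: "About Right",
    4: "Somewhat Easy",
    5: "Too Easy",
}


def nearest_label(mean_rating, labels):
    """Return the label of the scale point closest to a weighted mean rating."""
    nearest_point = min(labels, key=lambda point: abs(point - mean_rating))
    return labels[nearest_point]


# Weighted mean difficulty ratings for test set A (Academic Profile).
ap_difficulty = {"New Student": 2.28, "45 Hrs Plus": 2.64}

ap_labels = {
    group: nearest_label(rating, DIFFICULTY_LABELS)
    for group, rating in ap_difficulty.items()
}

for group, label in ap_labels.items():
    print(f"{group}: {ap_difficulty[group]:.2f} is closest to '{label}'")
```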

Some faculty members from the English, Mathematics, and Natural
Sciences Departments commented that none of the three test sets is a
perfect fit for WVU-P. Some of the test questions fell outside the
content of our curriculum or courses. Efforts to find a
more suitable testing instrument or to develop our own test are
recommended. However, the Board of Trustees has mandated student
testing for outcome purposes. It is anticipated that one of the
"nationally recognized" test instruments will be mandated in the near
future.
The following is the test content information for each test
(long form).

Test A - Academic Profile (test designed with three proficiency levels)

Subject                              Percent    No. of Items    Test Time

Humanities                                          36
  Literature                           50%
  Philosophy                           25%
  Music
  Art                                  10%
  Film                                  5%
Social Sciences                                     36
  History, Political Science,
    International Relations            47%
  Behavioral Sciences
    (Psychology, Sociology,
    Urban Studies)                     31%
  Economics
  Anthropology                          2%
Natural Sciences                                    36
  Biology*
  Chemistry
  Physics
  Experimental Finding
College-Level Reading                               36
College-Level Writing                               36
Critical Thinking                                   36
Mathematics                                         36
Total                                              144**         2.5 hr.

* In Natural Sciences, Biology is weighted more heavily than Chemistry and ...

** Reported from the user's manual; it is not the sum of the
items above.

Note: One can ask up to 50 locally written questions.






Test B - College BASE (without the 40-minute writing exercise)

Subject                          No. of Items*    Test Time

English                               41           40 min.
  Reading & Literature
  Writing
Social Studies                        42           40 min.
  History
  Social Sciences
Science                               41           40 min.
  Laboratory & Field Work
  Fundamental Concepts
Mathematics                           56           40 min.
  General Mathematics
  Algebra
  Geometry
Total                                180          160 min.

* Approximate number of questions.
Note: No locally written questions can be added.
Test C - ACT/CAAP

Subject                               No. of Items    Test Time

Writing Skills                             72          [illegible] min.
  Usage/Mechanics                          32
    Punctuation                             6
    Grammar                                 8
    Sentence structure                     18
  Rhetorical Skills                        40
    Strategy                               15
    Organization                           10
    Style                                  15
Mathematics                                35          50 min.
  Algebra                                  27
    Pre & elementary algebra                7
    Intermediate algebra
      and coordinate geometry              10
    Advanced algebra                       10
  Trigonometry                              4
  Calculus                                  4
Reading                                    36          50 min.
  Referring                               7-9
  Reasoning                             27-29
    Inferring                           22-26
    Applying                          [illegible]
Critical Thinking                                      50 min.
  Analysis of an argument                  20
  Evaluation of an argument                 6
  Extension of an argument
Science Reasoning                          32          50 min.
  Data Representation                      15
  Research Summaries                       24
  Conflicting Viewpoints                    6
Writing (Essay)                                        50 min.
Total*

* Each student can take any combination of test subjects;
therefore, the total number of questions and the testing time vary
accordingly.

Note: One can ask up to 9 locally written questions.
Stimuli: 1 = Academic Profile, 2 = College BASE, and 3 = ACT/CAAP.
RUN TITLE: All Faculty (number of responses = 26)
MATRIX OF INPUT PROPORTIONS:

          1        2        3

1       .500     .615     .654
2       .385     .500     .731
3       .346     .269     .500

MATRIX OF ORDERED PROPORTIONS:

          1        2        3

1       .500     .615     .654
2       .385     .500     .731
3       .346     .269     .500

MATRIX OF ORDERED Z-SCORES:

          1        2        3

1      .0000    .2924    .3934
2     -.2924    .0000    .6128
3     -.3934   -.6128    .0000

STANDARD DEVIATION OF STIMULI:

          1        2        3
       1.857     .263     .880

SCALE DISTANCE FROM STIMULUS AT RIGHT:

          1        2        3
        .068     .328

STIMULI NUMBER    SCALE VALUE (Faculty Preference)

1                  .396   (The Academic Profile)
2                  .328   (The College BASE)
3                  .000   (The ACT/CAAP)
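
Two steps of the printout above can be checked by hand. Each entry of the z-score matrix is the standard normal deviate of the corresponding proportion, and each scale value is the running sum of the scale distances, starting from the rightmost stimulus, which is fixed at zero. The sketch below uses Python's statistics.NormalDist for the inverse normal. The scaling program itself is not identified in the source, so the way it derives the scale distances from the z-scores and the stimulus standard deviations is not reproduced here. The exact inverse normal differs from the printed z-scores by a few thousandths, presumably because the program used an approximation.

```python
from statistics import NormalDist

# Input proportions for all 26 faculty (rows and columns are stimuli 1 to 3).
proportions = [
    [0.500, 0.615, 0.654],
    [0.385, 0.500, 0.731],
    [0.346, 0.269, 0.500],
]

# Convert each proportion to its standard normal deviate.
standard_normal = NormalDist()
z_scores = [[standard_normal.inv_cdf(p) for p in row] for row in proportions]

# Scale distances from the stimulus at right, as printed.
scale_distances = [0.068, 0.328]

# Accumulate from the rightmost stimulus, whose scale value is zero.
scale_values = [0.0]
for distance in reversed(scale_distances):
    scale_values.insert(0, scale_values[0] + distance)

for row in z_scores:
    print("  ".join(f"{z:7.4f}" for z in row))
print([round(value, 3) for value in scale_values])
```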



A sample interpretation (for the English faculty) follows.
Stimuli: 1 = Academic Profile, 2 = College BASE, and 3 = ACT/CAAP.

RUN TITLE: English Faculty (No. of responses = 10)
STIMULI NUMBER    SCALE VALUE

1                 1.072
2                  .184
3                  .000

INTERPRETATION: Relative to stimulus #3,
the English faculty's preference for
stimulus #1 is nearly 6 times stronger
than that for stimulus #2 (1.072 versus .184).

RUN TITLE: Mathematics Faculty (No. of responses = 3)
STIMULI NUMBER    SCALE VALUE

2                 1.166
1                  .583
3                  .000

RUN TITLE: Natural Science Faculty (No. of responses = 5)
STIMULI NUMBER    SCALE VALUE

2                  .842
1                  .129
3                  .000

RUN TITLE: Social Science Faculty (No. of responses = 8)
STIMULI NUMBER    SCALE VALUE

1                 -.451
2                 -.225
3                  .000

NOTE: IF ANY SCALE VALUE IS NEGATIVE, YOU MUST RECALCULATE ON THE BASIS
OF A STIMULUS DIFFERENCE TABLE. THE TROUBLE IS DUE TO THE FACT THAT
THE DATA ACTUALLY HAVE MORE THAN ONE DIMENSION. SOME PROPORTIONS WERE
INCONSISTENT.






Because of the multi-dimensional ratings from the Social Sciences 
faculty, the ratings are re-calculated without the responses from the 
Social Sciences Department. 

RUN TITLE: All Faculty (without Social Sciences)
PROGRAM INPUT PARAMETERS: NCOD, NS, NPS, NSP, NP, NITAP
(values as printed: 4, 3, 0, 0, 18 [number of responses])

MATRIX OF INPUT PROPORTIONS:

          1        2        3

1       .500     .611     .778
2       .389     .500     .778
3       .222     .222     .500

MATRIX OF ORDERED PROPORTIONS:

          1        2        3

1       .500     .611     .778
2       .389     .500     .778
3       .222     .222     .500

MATRIX OF ORDERED Z-SCORES:

          1        2        3

1      .0000    .2819    .7621
2     -.2819    .0000    .7621
3     -.7621   -.7621    .0000

STANDARD DEVIATION OF STIMULI:

          1        2        3
       1.317     .653    1.029

SCALE DISTANCE FROM STIMULUS AT RIGHT:

          1        2        3
        .207     .757

STIMULI NUMBER    SCALE VALUE (Faculty Preference)

1                  .965   (The Academic Profile)
2                  .757   (The College BASE)
3                  .000   (The ACT/CAAP)